good arm
- North America > United States > Wisconsin > Dane County > Madison (0.14)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > Canada (0.04)
- Asia > South Korea > Seoul > Seoul (0.04)
- North America > United States (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.87)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.66)
- Asia > South Korea > Seoul > Seoul (0.04)
- North America > United States (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.67)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.46)
- North America > United States > Wisconsin > Dane County > Madison (0.14)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > Canada (0.04)
An Adaptive Approach for Infinitely Many-armed Bandits under Generalized Rotting Constraints
Kim, Jung-hun, Vojnovic, Milan, Yun, Se-Young
In this study, we consider the infinitely many-armed bandit problems in a rested rotting setting, where the mean reward of an arm may decrease with each pull, while otherwise, it remains unchanged. We explore two scenarios regarding the rotting of rewards: one in which the cumulative amount of rotting is bounded by $V_T$, referred to as the slow-rotting case, and the other in which the cumulative number of rotting instances is bounded by $S_T$, referred to as the abrupt-rotting case. To address the challenge posed by rotting rewards, we introduce an algorithm that utilizes UCB with an adaptive sliding window, designed to manage the bias and variance trade-off arising due to rotting rewards. Our proposed algorithm achieves tight regret bounds for both slow and abrupt rotting scenarios. Lastly, we demonstrate the performance of our algorithm using numerical experiments.
- North America > United States (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > South Korea > Seoul > Seoul (0.04)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.66)
lil'HDoC: An Algorithm for Good Arm Identification under Small Threshold Gap
Tsai, Tzu-Hsien, Tsai, Yun-Da, Lin, Shou-De
Good arm identification (GAI) is a pure-exploration bandit problem in which a single learner outputs an arm as soon as it is identified as a good arm. A good arm is defined as an arm with an expected reward greater than or equal to a given threshold. This paper focuses on the GAI problem under a small threshold gap, which refers to the distance between the expected rewards of arms and the given threshold. We propose a new algorithm called lil'HDoC to significantly improve the total sample complexity of the HDoC algorithm. We demonstrate that the sample complexity of the first $\lambda$ output arm in lil'HDoC is bounded by the original HDoC algorithm, except for one negligible term, when the distance between the expected reward and threshold is small. Extensive experiments confirm that our algorithm outperforms the state-of-the-art algorithms in both synthetic and real-world datasets.
- South America > Paraguay > Asunción > Asunción (0.04)
- North America > United States > Colorado (0.04)
- Asia > Taiwan > Taiwan Province > Taipei (0.04)
- Information Technology > Artificial Intelligence > Machine Learning (0.69)
- Information Technology > Data Science > Data Mining > Big Data (0.51)
An Anytime Algorithm for Good Arm Identification
In good arm identification (GAI), the goal is to identify one arm whose average performance exceeds a given threshold, referred to as good arm, if it exists. Few works have studied GAI in the fixed-budget setting, when the sampling budget is fixed beforehand, or the anytime setting, when a recommendation can be asked at any time. We propose APGAI, an anytime and parameter-free sampling rule for GAI in stochastic bandits. APGAI can be straightforwardly used in fixed-confidence and fixed-budget settings. First, we derive upper bounds on its probability of error at any time. They show that adaptive strategies are more efficient in detecting the absence of good arms than uniform sampling. Second, when APGAI is combined with a stopping rule, we prove upper bounds on the expected sampling complexity, holding at any confidence level. Finally, we show good empirical performance of APGAI on synthetic and real-world data. Our work offers an extensive overview of the GAI problem in all settings.
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.92)
Differential Good Arm Identification
Tsai, Yun-Da, Tsai, Tzu-Hsien, Lin, Shou-De
This paper targets a variant of the stochastic multi-armed bandit problem called good arm identification (GAI). GAI is a pure-exploration bandit problem with the goal to output as many good arms using as few samples as possible, where a good arm is defined as an arm whose expected reward is greater than a given threshold. In this work, we propose DGAI - a differentiable good arm identification algorithm to improve the sample complexity of the state-of-the-art HDoC algorithm in a data-driven fashion. We also showed that the DGAI can further boost the performance of a general multi-arm bandit (MAB) problem given a threshold as a prior knowledge to the arm set. Extensive experiments confirm that our algorithm outperform the baseline algorithms significantly in both synthetic and real world datasets for both GAI and MAB tasks.
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Data Science > Data Mining > Big Data (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
Rotting infinitely many-armed bandits
Kim, Jung-hun, Vojnovic, Milan, Yun, Se-Young
We consider the infinitely many-armed bandit problem with rotting rewards, where the mean reward of an arm decreases at each pull of the arm according to an arbitrary trend with maximum rotting rate $\varrho=o(1)$. We show that this learning problem has an $\Omega(\max\{\varrho^{1/3}T,\sqrt{T}\})$ worst-case regret lower bound where $T$ is the horizon time. We show that a matching upper bound $\tilde{O}(\max\{\varrho^{1/3}T,\sqrt{T}\})$, up to a poly-logarithmic factor, can be achieved by an algorithm that uses a UCB index for each arm and a threshold value to decide whether to continue pulling an arm or remove the arm from further consideration, when the algorithm knows the value of the maximum rotting rate $\varrho$. We also show that an $\tilde{O}(\max\{\varrho^{1/3}T,T^{3/4}\})$ regret upper bound can be achieved by an algorithm that does not know the value of $\varrho$, by using an adaptive UCB index along with an adaptive threshold value.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > Finland > Uusimaa > Helsinki (0.04)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.67)
Finding All {\epsilon}-Good Arms in Stochastic Bandits
Mason, Blake, Jain, Lalit, Tripathy, Ardhendu, Nowak, Robert
The pure-exploration problem in stochastic multi-armed bandits aims to find one or more arms with the largest (or near largest) means. Examples include finding an {\epsilon}-good arm, best-arm identification, top-k arm identification, and finding all arms with means above a specified threshold. However, the problem of finding all {\epsilon}-good arms has been overlooked in past work, although arguably this may be the most natural objective in many applications. For example, a virologist may conduct preliminary laboratory experiments on a large candidate set of treatments and move all {\epsilon}-good treatments into more expensive clinical trials. Since the ultimate clinical efficacy is uncertain, it is important to identify all {\epsilon}-good candidates. Mathematically, the all-{\epsilon}-good arm identification problem presents significant new challenges and surprises that do not arise in the pure-exploration objectives studied in the past. We introduce two algorithms to overcome these and demonstrate their great empirical performance on a large-scale crowd-sourced dataset of 2.2M ratings collected by the New Yorker Caption Contest as well as a dataset testing hundreds of possible cancer drugs.
- North America > United States > New York (0.24)
- North America > United States > Wisconsin > Dane County > Madison (0.14)
- North America > United States > Washington > King County > Seattle (0.04)
- Research Report > New Finding (0.47)
- Research Report > Experimental Study (0.33)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Therapeutic Area > Oncology (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.48)
- Information Technology > Communications > Social Media > Crowdsourcing (0.34)